Finding and Visualizing Subspace Clusters of High Dimensional Dataset Using Advanced Star Coordinates
Author
Abstract
Analysis of high-dimensional data has been an active research area for many years. Clustering lets analysts detect the similarity of data points within a cluster, and subspace clustering identifies the dimensions that are actually useful for clustering a high-dimensional dataset. Visualization gives better insight into subspace clusters. However, displaying clusters from such a high-dimensional dataset on a 2-dimensional display is a challenging task. We propose the ISC-ASC approach, which first identifies subspace clusters in a high-dimensional dataset and then displays these clusters on a 2-dimensional display device. Algorithm ISC detects the subspace clusters using a density notion of clustering. Algorithm ASC visualizes these subspace clusters. Instead of considering all dimensions, ASC computes the projection points from only the dimensions that take part in the subspace clustering. ISC-ASC helps users identify subspace clusters. Visualizing these subspace clusters with ASC supports efficient knowledge discovery and helps users judge the quality of the subspace clusters.
Keywords: Subspace clustering, high dimensional data subspace clustering, visualization
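The projection step described in the abstract can be illustrated with a minimal sketch of the classic star coordinates mapping, restricted to the dimensions of one subspace cluster. This is not the authors' ASC algorithm, whose details the abstract does not give. The function name `star_coordinates`, the evenly spaced unit axes, and the min-max scaling are all assumptions made for illustration.

```python
import math

def star_coordinates(points, dims):
    """Project points to 2-D using only the dimensions listed in dims.

    Hypothetical helper: classic star coordinates, not the paper's ASC.
    """
    if not dims:
        raise ValueError("dims must name at least one dimension")
    k = len(dims)
    # One unit axis vector per selected dimension, evenly spaced around the origin.
    axes = [(math.cos(2 * math.pi * j / k), math.sin(2 * math.pi * j / k))
            for j in range(k)]
    # Range of each selected dimension, used to scale values into [0, 1].
    lows = [min(p[d] for p in points) for d in dims]
    highs = [max(p[d] for p in points) for d in dims]
    projected = []
    for p in points:
        x = y = 0.0
        for j, d in enumerate(dims):
            span = highs[j] - lows[j]
            # A constant dimension carries no information; map it to 0.
            t = (p[d] - lows[j]) / span if span else 0.0
            x += t * axes[j][0]
            y += t * axes[j][1]
        projected.append((x, y))
    return projected

# Four-dimensional toy data; assume a cluster was found in dimensions 0, 1 and 3.
data = [(0.0, 2.0, 9.0, 1.0), (1.0, 4.0, 3.0, 5.0), (0.5, 3.0, 7.0, 2.0)]
for xy in star_coordinates(data, dims=[0, 1, 3]):
    print(xy)
```

Dropping the dimensions that do not take part in the cluster leaves fewer axes around the origin, so the remaining axes are spread further apart and the irrelevant attributes cannot shift the projected points.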
Similar resources
A Preview on Subspace Clustering of High Dimensional Data
When clustering high dimensional data, traditional clustering methods fall short because they consider all of the dimensions of the dataset when discovering clusters, whereas only some of the dimensions are relevant. This may give rise to subspaces within the dataset where clusters may be found. Using feature selection, we can remove irrelevant and redundant dimensions by analyzing the ...
Visualizing High-density Clusters in Multidimensional Data
The analysis of multidimensional multivariate data has been studied in various research areas for many years. The goal of the analysis is to gain insight into the specific properties of the data by scrutinizing the distribution of the records at large and finding clusters of records that exhibit correlations among the dimensions or variables. As large data sets become ubiquitous but the screen ...
Clustering for High Dimensional Data: Density based Subspace Clustering Algorithms
Finding clusters in high dimensional data is a challenging task because such data comprises hundreds of attributes. Subspace clustering is an evolving methodology which, instead of finding clusters in the entire feature space, aims at finding clusters in various overlapping or non-overlapping subspaces of the high dimensional dataset. Density based subspace clustering algorithms t...
Visual Hierarchical Dimension Reduction
Traditional visualization techniques for multidimensional data sets, such as parallel coordinates, star glyphs, and scatterplot matrices, do not scale well to high dimensional data sets. A common approach to solve this problem is dimensionality reduction. Existing dimensionality reduction techniques, such as Principal Component Analysis, Multidimensional Scaling, and Self Organizing Maps, have ...
An Efficient Method for Finding Closed Subspace Clusters for High Dimensional Data
Subspace clustering tries to find groups of similar objects in a given dataset such that the objects are projected onto only a subset of the feature space. It finds meaningful clusters in all possible subspaces. However, when it comes to the quality of the resulting subspace clusters, most of them are redundant. These redundant subspace clusters don't provide new information. ...